
fix(site): Skip file-tree walk when refreshing only database usage#6765

Open
balamurali27 wants to merge 3 commits into develop from fix/database-usage-skip-file-walk

@balamurali27 balamurali27 commented Jun 22, 2026


Problem

refresh_database_usage (polled by the dashboard's Site Overview / Database Analyzer pages every few seconds) goes through the generic sync_info, which calls the agent's get_usage. That recursively walks the site's entire file tree (public, private, backups) just to sum file sizes — even though the caller only wants the database size.

On sites with large file trees the walk exceeds the gunicorn timeout and kills agent web workers (WORKER TIMEOUT). Because the dashboard re-polls every ~3s, a single bloated site keeps the walk running back-to-back.

Fix

Thread a database_only flag through get_site_info → fetch_info → sync_info:

  • In that mode the agent is asked (via ?database_only=1) for just the cheap database size.
  • _sync_database_usage records a Site Usage row that carries the last-known file totals forward instead of recomputing them, so disk-usage reporting is unaffected.
  • Both DB-refresh callers (the refresh_database_usage parser-off branch and the process_refresh_database_usage_job_update callback) pass database_only=True.
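The carry-forward step in the list above can be sketched as follows. This is a minimal illustration, not the PR's actual `_sync_database_usage`: the helper name `build_database_only_usage`, the plain-dict row shape, and the zero default used when no earlier row exists are all assumptions.

```python
from typing import Optional

# File-tree totals that a database-only refresh does not recompute.
FILE_KEYS = ("public", "private", "backups")


def build_database_only_usage(fetched_usage: dict, last_usage: Optional[dict]) -> dict:
    """Combine a fresh database size with the last-known file totals.

    `last_usage` stands in for the most recent Site Usage row (hypothetical
    dict form); it may be None when the site has no usage history yet.
    """
    last_usage = last_usage or {}
    row = {"database": fetched_usage["database"]}
    for key in FILE_KEYS:
        # Carry the previous value forward instead of walking the file tree.
        row[key] = last_usage.get(key, 0)
    return row


previous = {"database": 400, "public": 10, "private": 20, "backups": 30}
row = build_database_only_usage({"database": 512}, previous)
```

Only the `database` value changes between rows in this mode, so disk-usage totals derived from the newest row stay the same as after the last full sync.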

Deploy note

Paired PR: frappe/agent#548

Requires the matching agent change to honour ?database_only (frappe/agent#548). Without it the agent ignores the param and falls back to the full walk (current behaviour), so the two can be deployed in either order safely.
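A rough sketch of why either deploy order is safe, assuming the flag travels as a plain query parameter. The path layout `sites/<name>/info` and both function names are hypothetical; only `?database_only=1` comes from this PR. An agent that never reads the parameter sees the same path and does the full walk, while an updated agent can branch on it.

```python
from urllib.parse import parse_qs, urlsplit


def site_info_path(site_name: str, database_only: bool = False) -> str:
    """Build the (hypothetical) agent path, adding the flag only when set."""
    path = f"sites/{site_name}/info"
    if database_only:
        path += "?database_only=1"
    return path


def wants_database_only(path: str) -> bool:
    """What an updated agent might check; an older agent skips this check."""
    query = parse_qs(urlsplit(path).query)
    return query.get("database_only", ["0"])[0] == "1"


full = site_info_path("example.com")
fast = site_info_path("example.com", database_only=True)
```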

Test

Added test_sync_info_database_only_refreshes_db_size_and_carries_files_forward — asserts a database-only sync refreshes the DB size while carrying public/private/backups forward, and that fetch_info is called with database_only=True.

🤖 Generated with Claude Code

A database-usage refresh went through the generic sync_info, which calls
the agent's get_usage — and that recursively walks the site's entire
file tree (public, private, backups) just to sum file sizes. On sites
with large file trees the walk exceeds the gunicorn timeout, killing
agent web workers. The dashboard polls refresh_database_usage every few
seconds, so a single bloated site keeps the walk running back-to-back.

Threading a database_only flag through get_site_info -> fetch_info ->
sync_info fixes this: in that mode the agent is asked (via
?database_only=1) for just the cheap database size, and
_sync_database_usage records a Site Usage row that carries the
last-known file totals forward instead of recomputing them.

Requires the matching agent change to honour ?database_only; without it
the agent ignores the param and falls back to the full walk (current
behaviour), so the two can be deployed in either order safely.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented Jun 22, 2026


Confidence Score: 4/5

Safe to merge with the understanding that database_free and database_free_tables will be silently reset on each dashboard poll if the agent's database-only response omits them.

_sync_database_usage explicitly carries public, private, and backups forward from the last Site Usage row, but applies bare .get(..., 0) / .get(..., []) defaults for database_free and database_free_tables — meaning if the agent returns only the base database size (as described in the PR), fragmentation data is wiped on every 3-second poll. The fix is minimal (fall back to last values), but until it lands, sites that had meaningful fragmentation data in prior full syncs will show 0 / empty on the dashboard indefinitely.

press/press/doctype/site/site.py — specifically the _sync_database_usage method's handling of database_free and database_free_tables.
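One way the suggested fallback could look, as a sketch only. The helper `pick_or_carry`, the dict form of the last Site Usage row, and the table name `tabVersion` are illustrative assumptions rather than code from the PR.

```python
from typing import Any, Optional


def pick_or_carry(fetched: dict, last: Optional[dict], key: str, default: Any) -> Any:
    """Prefer the agent's value, then the last recorded value, then a default."""
    if key in fetched:
        # The agent reported the field, so trust it even if it is 0 or empty.
        return fetched[key]
    if last and key in last:
        # Field omitted in database-only mode: keep the last-known value.
        return last[key]
    return default


last_row = {"database_free": 2048, "database_free_tables": ["tabVersion"]}
db_only_response = {"database": 512}  # no fragmentation fields

free = pick_or_carry(db_only_response, last_row, "database_free", 0)
free_tables = pick_or_carry(db_only_response, last_row, "database_free_tables", [])
```

Checking key presence rather than truthiness matters here: a genuine `database_free` of 0 from a full sync should overwrite the old value, while a missing key should not.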

Important Files Changed

| Filename | Overview |
| --- | --- |
| press/agent.py | Adds database_only parameter to get_site_info, appending ?database_only=1 to the agent URL path when set; straightforward and correct. |
| press/press/doctype/site/site.py | Introduces _sync_database_usage and threads database_only through sync_info/fetch_info; database_free and database_free_tables are not carried forward from the last known state when the agent omits them in database-only mode, silently resetting fragmentation data on each poll. |
| press/press/doctype/site/test_site.py | Adds a focused test asserting database size is updated and file sizes are carried forward in database-only mode; covers the happy path well. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Dashboard
    participant Site as site.py
    participant Agent as agent.py
    participant DB as Database

    Dashboard->>Site: refresh_database_usage()
    alt schema parser disabled
        Site->>Site: "sync_info(database_only=True)"
        Site->>Agent: "get_site_info(site, database_only=True)"
        Agent->>Agent: "GET /info?database_only=1"
        Agent-->>Site: "{usage: {database: X}}"
        Site->>Site: _sync_database_usage(usage)
        Site->>DB: get_last_doc("Site Usage")
        Note over Site: carries public/private/backups forward
        Site->>DB: insert Site Usage row
        Site-->>Dashboard: "{synced: True}"
    else schema parser enabled
        Site->>Site: (schema parser path, unchanged)
    end

Reviews (2): Last reviewed commit: "Merge branch 'develop' into fix/database..." | Re-trigger Greptile

Comment on lines +2114 to +2116
self._insert_site_usage(
{
"database": fetched_usage["database"],


P1 _sync_database_usage accesses fetched_usage["database"] with a hard key lookup. If the agent returns a response that passes the if not data guard but has a missing or renamed "database" key (e.g., during a partial rollout of the matching agent change), this raises an unhandled KeyError in the refresh_database_usage code path, where there is no suppress(Exception) wrapper — surfacing a 500 to the polling dashboard every 3 s.

Suggested change
- self._insert_site_usage(
-     {
-         "database": fetched_usage["database"],
+ database_size = fetched_usage.get("database")
+ if database_size is None:
+     return
+ self._insert_site_usage(
+     {
+         "database": database_size,

Comment thread press/press/doctype/site/site.py Outdated

  @frappe.whitelist()
- def sync_info(self, data=None):
+ def sync_info(self, data=None, database_only=False):


P2 database_only is a raw bool default on a @frappe.whitelist() method. When invoked via the HTTP API with a query-string value such as database_only=False, Frappe passes it as the string "False", which is truthy in Python. This would silently take the fast _sync_database_usage path (skipping config/timezone sync) even when the caller intended the full sync. Consider coercing with frappe.utils.sbool(database_only) or frappe.utils.cint(database_only) at the top of the method.
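The truthiness pitfall is easy to reproduce in plain Python. The `to_bool` helper below is a stand-in written for this illustration; it is not Frappe's `sbool` or `cint`, and the set of accepted strings is an assumption.

```python
def to_bool(value) -> bool:
    """Illustrative coercion: treat common 'true' strings as True, the rest as False."""
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true", "yes", "y"}
    return bool(value)


raw = "False"              # what a query-string value looks like on arrival
naive = bool(raw)          # any non-empty string is truthy
coerced = to_bool(raw)     # explicit parsing recovers the intended meaning
```

Without coercion, `if database_only:` takes the fast path for `"False"`, `"0"`, and every other non-empty string.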


Over the HTTP API a query-string value like database_only=False arrives
as the string "False", which is truthy and would wrongly take the
fast _sync_database_usage path. Annotating the parameter as bool lets
Frappe's whitelist argument coercion convert it via sbool.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@balamurali27


recent f1-ksa stuck jobs incident

@codecov-commenter

codecov-commenter commented Jun 22, 2026


Codecov Report

❌ Patch coverage is 75.00000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.70%. Comparing base (b56a7c7) to head (da8b5b8).
⚠️ Report is 9 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| press/agent.py | 20.00% | 4 Missing ⚠️ |
| press/press/doctype/site/site.py | 75.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #6765       +/-   ##
============================================
- Coverage    62.94%   50.70%   -12.24%     
============================================
  Files          117      994      +877     
  Lines        18110    83816    +65706     
  Branches       527      527               
============================================
+ Hits         11399    42498    +31099     
- Misses        6678    41285    +34607     
  Partials        33       33               
| Flag | Coverage Δ |
| --- | --- |
| dashboard | 62.94% <ø> (ø) |
